Entry Name:  "ZJU-Xu-MC2"

VAST Challenge 2015
Mini-Challenge 2

 

 

Team Members:


Jin Xu, Zhejiang University, simplecat123@zju.edu.cn PRIMARY

Shuilin Ren, Zhejiang University, shuilinren@foxmail.com

Yubo Tao, Zhejiang University, taoyubo@cad.edu.cn SUPERVISOR

Hai Lin, Zhejiang University, lin@cad.zju.edu.cn SUPERVISOR

 

Student Team: YES

 

Did you use data from both mini-challenges?  No

 

Analytic Tools Used:

D3

Excel

Oracle database

Approximately how many hours were spent working on this submission in total?

We spent about 220 hours on this submission.

May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2015 is complete? YES

 

 

Video:

 

ZJU-Xu-MC2.wmv

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Questions

 

MC2.1 – Identify those IDs that stand out for their large volumes of communication.  For each of these IDs

 

      a.      Characterize the communication patterns you see.

      b.      Based on these patterns, what do you hypothesize about these IDs?

 

Limit your response to no more than 4 images and 300 words.

 

Fig. 1-1: Volumes of communication on Friday, Saturday, and Sunday

 

To identify the IDs that stand out for their large volumes of communication, we loaded the data into an Oracle database and accumulated the communications for each hour. As shown in Fig. 1-1, four sources stand out: ID 1278894, ID 839736, the external ID, and a group of the 24 next-largest IDs.
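The hourly accumulation described above can be sketched in a few lines of Python. This is only an illustration, not the Oracle queries we actually ran: the row layout (timestamp, sender, receiver, location), the timestamp format, and the sample rows are assumptions made for the example.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample rows: (timestamp, sender, receiver, location).
# The layout and the timestamp format are assumptions for this sketch.
rows = [
    ("2014-06-06 12:00:05", "1278894", "100", "Entry Corridor"),
    ("2014-06-06 12:05:05", "1278894", "101", "Entry Corridor"),
    ("2014-06-06 12:10:00", "200", "839736", "Wet Land"),
    ("2014-06-06 13:01:00", "200", "external", "Wet Land"),
]

def hourly_volume(rows):
    """Count messages per (ID, day, hour), crediting sender and receiver."""
    counts = Counter()
    for ts, sender, receiver, _location in rows:
        t = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        for who in (sender, receiver):
            counts[(who, t.date().isoformat(), t.hour)] += 1
    return counts

def total_volume(rows):
    """Count messages per ID over the whole data set."""
    totals = Counter()
    for _ts, sender, receiver, _location in rows:
        totals[sender] += 1
        totals[receiver] += 1
    return totals

hourly = hourly_volume(rows)
top = total_volume(rows).most_common(2)  # the two busiest IDs in the sample
```

Ranking the totals gives the candidate high-volume IDs, and the per-hour counts give the curves plotted in a chart such as Fig. 1-1.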

1.        ID 1278894

Observing the communication of 1278894, we see that it sends messages only between 12:00 and 21:00. Within this interval, it sends messages every 5 minutes for an hour and then pauses for an hour. Each time, it sends a large volume of messages at the same moment. Comparing the numbers of sent and received messages across the three days, we find that 1278894 never sends messages at any other time. Its communication pattern is therefore regular.
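One way to check such regularity is to look at the gaps between consecutive distinct send times of an ID. The sketch below assumes a timestamp format and uses made-up send times; it is not the query we ran.

```python
from collections import Counter
from datetime import datetime

def burst_gaps(timestamps):
    """Return gaps in seconds between consecutive distinct send times."""
    times = sorted({datetime.strptime(ts, "%Y-%m-%d %H:%M:%S") for ts in timestamps})
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]

# Hypothetical send times for one ID: bursts 5 minutes apart, then a long pause.
sends = [
    "2014-06-06 12:00:00", "2014-06-06 12:00:00",  # two messages in one burst
    "2014-06-06 12:05:00",
    "2014-06-06 12:10:00",
    "2014-06-06 14:00:00",
]

gaps = burst_gaps(sends)
gap_histogram = Counter(gaps)  # a regular sender has one dominant gap value
```

A histogram dominated by a single value (300 seconds here) plus a few long pauses matches the behavior described for 1278894.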

As shown in Fig. 1-2, 1278894 sends messages to some (not all) members of two groups. We also find that 1278894 stays at Entry Corridor on all three days. On the Dinofun World map, location 60 in the Entry Corridor is used for daily slab maps and information. We therefore infer that 1278894 is responsible for announcing daily affairs to visitors around the park and is stationed at location 60.

2.        ID 839736

839736 stays at Entry Corridor, and its sending and receiving patterns are similar. It receives a large number of messages between 8:00 and 10:00 on Friday and Saturday. On Sunday, however, it receives a large number of messages from 11:30 to 12:00; the vandalism may have been noticed by then, prompting visitors to ask 839736 about it. We therefore infer that ID 839736 provides a consulting service for visitors.

Fig. 1-2

 

3.        ID external

The external ID represents communications between people inside the park and the outside. People inside the park tend to share information about activities or unexpected events, so the times at which the external ID receives a large number of messages may indicate a stage show or the vandalism.

Fig.1-3: external communication on Saturday and Sunday

 

4.        Group ID

From the remaining IDs, we identify the 24 with the largest volumes of communication (ID: 1116329, 1045021, 1250941, 918738, 128533, 1749109, 1427875, 1388162, 49375, 1300247, 970490, 484248, 1508923, 530908, 810123, 992045, 1280922, 38622, 174974, 171002, 1692925, 856067, 1410699, 74616).

We find that they have similar communication patterns. Querying the database shows that they always send large numbers of messages simultaneously, and the topological relations among them are the same on all three days. They move through the park and send or receive messages from 8:30 to 23:00 each day (see, e.g., ID 1116329 in Fig. 1-4). We think they are most likely staff.

Fig. 1-4
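The observation that these IDs send messages simultaneously can be checked by counting how often two IDs send at exactly the same timestamp. The following is a minimal sketch under assumed inputs: the (timestamp, sender) rows are invented for illustration, and "555" is a hypothetical ID that is not part of the group.

```python
from collections import Counter
from itertools import combinations

def co_sending_pairs(rows):
    """Count how often each pair of IDs sends at exactly the same timestamp."""
    senders_at = {}
    for ts, sender in rows:
        senders_at.setdefault(ts, set()).add(sender)
    pairs = Counter()
    for senders in senders_at.values():
        # Sort so that each pair is always stored in the same order.
        for a, b in combinations(sorted(senders), 2):
            pairs[(a, b)] += 1
    return pairs

# Hypothetical (timestamp, sender) rows.
rows = [
    ("08:30:00", "1116329"), ("08:30:00", "1045021"),
    ("09:00:00", "1116329"), ("09:00:00", "1045021"), ("09:00:00", "555"),
    ("10:00:00", "555"),
]

pairs = co_sending_pairs(rows)
```

Pairs with high counts relative to their individual message totals are candidates for belonging to the same synchronized group.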

 

MC2.2 – Describe up to 10 communications patterns in the data. Characterize who is communicating, with whom, when and where. If you have more than 10 patterns to report, please prioritize those patterns that are most likely to relate to the crime.

 

Limit your response to no more than 10 images and 1000 words.

 

According to the plan for Scott’s weekend, there would be two stage shows on each of Friday and Saturday and a talk by Scott on Sunday. In addition, a show of memorabilia would be on display each day. Therefore, without the vandalism, the visiting patterns on Friday, Saturday, and Sunday should be similar. By analyzing how the volumes of communication change over time, together with the communication relationships, all park visitors and staff can be classified. These features are shown in Fig.2-1.

 

Fig.2-1

 

Pattern 1: Visitors gathered at Coaster Alley to watch the stage show on Friday and Saturday.

Typical ID: 686348, 1821472, 1862455, 1251090

There are 4 such typical clusters on Friday and 11 on Saturday. For example, the cluster shown in Fig.2-2 has 37 members and 16 leaders, and communicates only with members of the same cluster, ID 1278894, and ID 839736. It arrived at Coaster Alley at 14:47:59, stopped communicating at 14:49:03, and started communicating again at 16:00:02. A stage show probably took place during this period.

Fig.2-2: Pattern 1
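Clusters that communicate only internally and with the high-volume IDs can be approximated as connected components of the communication graph once those IDs are ignored. The sketch below uses a union-find structure for this. It is an assumption about how such clusters could be recovered, not a description of our actual tool, and the edges are invented.

```python
from collections import defaultdict

# High-volume IDs from MC2.1, ignored so that they do not merge every cluster.
HUBS = {"1278894", "839736", "external"}

def find(parent, x):
    """Return the root of x, halving the path along the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def clusters(edges, hubs=HUBS):
    """Group IDs into connected components, skipping edges that touch a hub."""
    parent = {}
    for a, b in edges:
        if a in hubs or b in hubs:
            continue
        for node in (a, b):
            parent.setdefault(node, node)
        root_a, root_b = find(parent, a), find(parent, b)
        if root_a != root_b:
            parent[root_a] = root_b
    groups = defaultdict(set)
    for node in parent:
        groups[find(parent, node)].add(node)
    # Largest clusters first; members sorted for stable output.
    return sorted((sorted(g) for g in groups.values()), key=len, reverse=True)

# Hypothetical (sender, receiver) pairs.
edges = [("a", "b"), ("b", "c"), ("a", "1278894"), ("d", "e"), ("e", "839736")]
found = clusters(edges)
```

IDs that talk only to the hubs do not appear in any cluster in this sketch, which is acceptable when the goal is to find visitor groups.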

 

Pattern 2: On Sunday, visitors did not gather at Coaster Alley to watch the talk.

Typical ID: 60390, 187692, 773078, 874532

On Sunday, there are no typical clusters like those in Pattern 1. There is, however, a typical cluster of visitors with 44 members and 19 leaders that communicates only with members of the same cluster, ID 1278894, and ID 839736. Intending to attend Scott’s talk as planned, the cluster arrived at Coaster Alley at 14:37:28, but its leaders received messages from ID 1278894 at 14:45:00, which may have informed them that the talk was canceled. They then left Coaster Alley at once.

Fig.2-3: Pattern 2

 

Pattern 3: Extra security guards appeared on Sunday only

Typical ID: 2047906, 955733, 1038892, 378256

This cluster is the most typical among the clusters on Sunday. It has 37 members and 19 leaders. It communicates with members inside the cluster and with two other IDs (1278894 and 839736), and it has its largest volume of communication between 11:00 and 12:00. Cluster members entered the park at Entry Corridor at 9:32:38, went directly to Wet Land, arriving at 9:41:11, and stayed there until 12:06:49. Creighton Pavilion, where the show of memorabilia was to be held, is in Wet Land. The news reported that extra security would be added at the Pavilion to ensure visitors’ safety. Therefore, the cluster members are most likely the extra security guards. The cluster suddenly sent a lot of messages at 11:32:58, which indicates that the vandalism had been discovered, and it stayed at Wet Land until the problem was solved.

 

Fig.2-4: Pattern 3

 

Pattern 4: Staff appear on all three days and communicate from morning to night.

Typical ID: 935776, 714380, 733140, 805298

This pattern concerns staff. In general, staff appear on all three days and communicate from morning to night, and their volumes of communication are relatively stable. As shown in Fig.2-5, the two clusters in this pattern behave in this way. The members of each cluster probably belong to the same group because they have the same roles each day. The two clusters have few associations with each other, which is understandable: they may have different roles in the park but still occasionally need to exchange information. Both clusters send messages from every location in the park, from which we infer that they have to patrol every corner.

 

Fig.2-5: Pattern 4

 

Pattern 5: Location suspect

ID: 416790, 1187909, 1502920, 1123214, 1350546, 461004, 1000279

1. The cluster stayed at Wet Land all the time on Sunday, when the vandalism occurred.

2. They stayed at other locations only very briefly, possibly to avoid suspicion, and did nothing at those locations.

3. They suddenly sent a lot of messages at 12:00, from which we infer that they might have been discussing the progress or the result.

Fig 2-6: Pattern 5

 

Pattern 6: Suspect

ID: 962171, 1558676, 458709, 630410, 912123, 1948458

1. The cluster communicated throughout the day except between 9:31:38 and 11:05:21. A member sent a message at 11:05:21, which shows that they were at Wet Land. But when the vandalism was discovered at 11:32, they left Wet Land. Ordinary visitors would be more inclined to go toward the site of an incident than to leave it.

2. The members also appeared on Saturday, and their routes were similar to those on Sunday: they spent almost all day at Wet Land.

3. The cluster communicated with staff on Saturday, but its members communicated only with each other on Sunday.

Fig 2-7: Pattern 6
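A silent period such as the one in item 1 can be located by finding the longest gap between consecutive messages of a cluster. The sketch below assumes time-of-day strings for a single day, and the sample times are illustrative only.

```python
from datetime import datetime

def longest_silence(timestamps):
    """Return (start, end, seconds) of the longest gap between messages."""
    times = sorted(datetime.strptime(ts, "%H:%M:%S") for ts in timestamps)
    if len(times) < 2:
        return None  # a gap needs at least two messages
    start, end = max(zip(times, times[1:]), key=lambda pair: pair[1] - pair[0])
    seconds = int((end - start).total_seconds())
    return start.strftime("%H:%M:%S"), end.strftime("%H:%M:%S"), seconds

# Hypothetical message times of one cluster on one day.
messages = ["09:10:00", "09:31:38", "11:05:21", "11:20:00"]
silence = longest_silence(messages)
```

Comparing the longest silence of each cluster with the time window of the incident is one way to shortlist clusters for closer inspection.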

 

Pattern 7: Suspect

ID: 1279196,1385263,1646340,1892771,484032

The cluster appeared only in the morning, and during that period its members spent most of their time at Wet Land.

Fig 2-8: Pattern 7

 

MC2.3 – From this data, can you hypothesize when the crime was discovered?  Describe your rationale.

 

Limit your response to no more than 3 images and 300 words. 

 

We hypothesize that the crime was discovered at 11:32:58.

The entry to Creighton Pavilion is located in Wet Land, so we focus on the communications in Wet Land. Examining statistics over time, location, and number of communications, we see that the communication trends on Friday and Saturday are similar, but the volume on Sunday between 11:00 and 12:00 is considerably larger than on the other days (Fig. 3-1).

Fig. 3-1
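The comparison across days can be expressed as a simple rule: flag an hour when the count on the day of interest clearly exceeds the average of the other days. The sketch below shows that rule with invented hourly counts. The numbers and the factor of 2 are assumptions for illustration, not values measured from the data.

```python
def unusual_hours(baseline_days, target_day, ratio=2.0):
    """Return hours where the target count exceeds ratio * baseline mean."""
    flagged = []
    for hour, count in sorted(target_day.items()):
        base = [day.get(hour, 0) for day in baseline_days]
        mean = sum(base) / len(base)
        # Hours with no baseline traffic are skipped rather than flagged.
        if mean > 0 and count > ratio * mean:
            flagged.append(hour)
    return flagged

# Hypothetical message counts per hour at one location (hour -> count).
friday = {10: 100, 11: 120, 12: 110}
saturday = {10: 110, 11: 130, 12: 120}
sunday = {10: 105, 11: 400, 12: 150}

flagged = unusual_hours([friday, saturday], sunday)
```

With these sample numbers only the 11:00 hour on Sunday is flagged, which is the kind of deviation visible in Fig. 3-1.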

 

From the communication volumes, we know that ID 839736 is a consulting staff member in the park whom every visitor can ask for help. In the communications both received and sent by 839736 (Fig. 3-2), the numbers suddenly become quite large. We infer that news of the vandalism spread around Wet Land, and that many people were talking about the event and asking 839736 whether the activities would be held as planned. Analyzing the external ID (Fig. 3-2), we also find that people at Wet Land suddenly sent a large number of messages to the outside between 11:30 and 12:00. So the vandalism must have been discovered by this time.

Fig. 3-2

 

The cluster we identified as the “extra security guards” suddenly sent a lot of messages at 11:32:58 at Wet Land, which indicates that the vandalism had been discovered. The cluster stayed at Wet Land until the problem was solved at 12:06:49.

Fig. 3-3
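The onset of such a sudden burst can be estimated as the first message that is followed by several more within a short window. The sketch below is one possible formulation. The window length, the message threshold, and the sample times are assumptions for illustration.

```python
from datetime import datetime, timedelta

def burst_onset(timestamps, window_seconds=60, min_messages=5):
    """Return the first time that starts min_messages within the window."""
    times = sorted(datetime.strptime(ts, "%H:%M:%S") for ts in timestamps)
    window = timedelta(seconds=window_seconds)
    for i, start in enumerate(times):
        j = i + min_messages - 1  # index of the last message of the burst
        if j < len(times) and times[j] - start <= window:
            return start.strftime("%H:%M:%S")
    return None  # no burst found

# Hypothetical message times of one cluster at one location.
messages = [
    "11:00:10", "11:15:40",
    "11:32:58", "11:33:01", "11:33:05", "11:33:20", "11:33:41",
    "11:40:00",
]
onset = burst_onset(messages)
```

Applied to the messages of a single cluster at a single location, the returned time is a candidate for the moment the event was noticed.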